- Title
- Mining numerical invariants for improving software reliability
- Creator
- Zhang, Bo
- Relation
- University of Newcastle Research Higher Degree Thesis
- Resource Type
- thesis
- Date
- 2022
- Description
- Research Doctorate - Doctor of Philosophy (PhD)
- Description
- Program invariants are conditions that can be relied on to be true during the execution of a program. This thesis aims to develop approaches to automatically mine numerical invariants from data produced by programs and use them to improve software reliability. We focus on two types of numerical invariants: metamorphic relations (MRs) mined from program inputs and outputs, which can be used to assist bug detection in software testing; and workflow relations mined from log data, which can be used to monitor software status and detect anomalies. For numerical invariants mined from program inputs and outputs, we propose a general method, AutoMR, to automatically infer and cleanse MRs. The proposed approach can infer both equality and inequality MRs, and MRs of linear, quadratic, and even higher degrees. AutoMR employs a general parameterization of arbitrary polynomial MRs and adopts the particle swarm optimization technique to search for suitable parameters. It also uses matrix singular-value decomposition and constraint-solving techniques to cleanse the MRs by removing redundancy. We apply the approach to 37 numerical programs and evaluate the fault-detection capacity of the inferred MRs. The results show that AutoMR can effectively infer various types of MRs, which can be used successfully to detect faults in mutation testing and differential testing. For numerical invariants from program logs, we design two approaches, sADR and uADR, to handle semi-supervised and unsupervised scenarios, respectively. First, we propose a novel semi-supervised method, sADR, which requires only a small set of normal logs to extract the numerical invariants. Anomalies can be detected by evaluating whether or not the logs violate the mined invariants.
Because labeling logs is time-consuming and tedious, we design a novel minimal-rank-based sampling technique that takes advantage of the rank difference between the event count matrices of normal and abnormal log sequences. The sampling method helps select a number of seeds, which are used to mine likely numerical invariants. We evaluate the proposed approaches on three public datasets and successfully mine numerical invariants (workflow relations) from logs, which can be used to detect system anomalies effectively. In summary, this thesis develops approaches to automatically mine the two types of numerical invariants and conducts experiments to evaluate their use in improving software reliability. The relations mined from program inputs and outputs can be used to assist bug detection in software testing, and those mined from logs can be used to monitor software status and detect anomalies.
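To illustrate how a metamorphic relation can expose a fault without a test oracle, the following is a minimal sketch and is not taken from the thesis. It assumes a hand-written relation for the sine function, sin(pi - x) = sin(x), rather than one inferred by AutoMR. The names `holds` and `mutant_sin`, the tolerance, and the input range are all hypothetical choices made for this example.

```python
import math
import random


def holds(program, trials=1000, tol=1e-9, seed=0):
    """Return True if program(pi - x) == program(x) on all sampled inputs.

    The relation is an assumed example of an equality MR; it is checked
    only on random samples, so a True result is evidence, not proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-10.0, 10.0)
        if abs(program(math.pi - x) - program(x)) > tol:
            return False  # relation violated: the program is suspect
    return True


def mutant_sin(x):
    # Hypothetical faulty implementation with a small input-dependent error.
    return math.sin(x) + 1e-3 * x


print(holds(math.sin))    # the reference implementation satisfies the relation
print(holds(mutant_sin))  # the mutant violates it
```

The check never needs the expected value of sin(x) for any input; it only compares two outputs of the same program, which is what makes such relations useful in mutation testing.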
- Subject
- numerical invariants; software reliability; software logs
- Identifier
- http://hdl.handle.net/1959.13/1504951
- Identifier
- uon:55601
- Rights
- Copyright 2022 Bo Zhang
- Language
- eng
- Full Text
File | Description | Size | Format
---|---|---|---
ATTACHMENT01 | Thesis | 10 MB | Adobe Acrobat PDF
ATTACHMENT02 | Abstract | 615 KB | Adobe Acrobat PDF